Previous lecture: Huffman.htm | Next lecture: ..\GIF\LZ.htm | Syllabus: ..\420Syllabus.htm | Homework: ..\..\HW\Week3\cs420_3_2004.htm

Other stuff not covered in class: Review of probability (Probability.htm), Channels (Channels.doc)

Information Theory

Information theory assumes that the generation of information is a probabilistic process. A random event E that occurs with probability P(E) contains I(E) units of information, defined by:

I(E) = log(1/P(E)) = -log(P(E))

The less likely an event is, the more information it contains. Typically we will use a base-2 logarithm, so the units are bits. If P(E) = 1/4, then I(E) = -log2(1/4) = log2(4) = 2 bits of information.

Assume that an information source can produce the set A = {a1, a2, ..., aJ} of symbols, called the source alphabet. For example, these could be pixel values. The probability that the source will produce symbol aj is P(aj). The amount of information from generating this symbol is

I(aj) = -log P(aj)

If a large number, k, of symbols is generated, then on average symbol aj will occur k P(aj) times. The average amount of information from the k outputs will be

-k Σj P(aj) log P(aj)    (sum over j = 1, ..., J)

The first-order entropy of the information source is defined to be

H = -Σj P(aj) log P(aj)

If we look at pairs of symbols, we get the second-order entropy:

H2 = -Σi Σj P(ai aj) log P(ai aj)    (both sums over 1, ..., J; this is measured per pair of symbols)

The entropy of the source is defined to be the limit as we take longer and longer sequences of symbols. If there is no correlation among the symbols (they are independent and identically distributed), then the entropy is the same as the first-order entropy. Entropy is the average amount of information provided by the source. The entropy is maximized when all symbols are equally probable.

We can count the number of times each gray level occurs in an image (a histogram), use the normalized counts as estimates of the probabilities, and use the equation above to compute H. This is called the first-order estimate of the entropy. It is an estimate, since we are using estimated probabilities. The maximum amount of compression achievable by variable-length coding alone (such as Huffman coding) cannot exceed the first-order estimate of the entropy.

Example

Consider an image:

21  21  21  95  169  243  243  243
21  21  21  95  169  243  243  243
21  21  21  95  169  243  243  243
21  21  21  95  169  243  243  243

The histogram and first-order estimates are:

Gray Level    Count    Probability
21            12       3/8
95            4        1/8
169           4        1/8
243           12       3/8

H = -(3/8) log2(3/8) - (1/8) log2(1/8) - (1/8) log2(1/8) - (3/8) log2(3/8)
  = [-(3/4) log(3/8) - (1/4) log(1/8)] / log(2)
  = [-0.75 * (-0.42597) - 0.25 * (-0.90309)] / 0.30103
  = 1.811 bits / pixel

Remember that log2(x) = log(x) / log(2).

Since there are 32 pixels in the original image, the best compression that we could get by variable-length coding would be 32 * 1.811 ≈ 58 bits.

The second-order estimate of the source entropy is obtained by looking at pairs of pixels in raster scan order (each pixel paired with the pixel that follows it), assuming that the end of the image wraps around to the front. This gives:

Gray Level Pair    Count    Probability
(21, 21)           8        1/4
(21, 95)           4        1/8
(95, 169)          4        1/8
(169, 243)         4        1/8
(243, 243)         8        1/4
(243, 21)          4        1/8

H = 2.5 bits per pair of pixels = 1.25 bits / pixel

The difference between the higher-order estimates of entropy and the first-order estimate gives the amount of interpixel redundancy. If the pixels are statistically independent, then the entropy estimates at all orders are equal, and lossless compression is optimized by variable-length coding alone.
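Both estimates can be checked directly from the histograms. The following is a minimal Python sketch (not part of the original notes; the helper names are my own) that builds the gray-level histogram and the wrap-around pair histogram for the example image and computes the two entropy estimates described above.

```python
from collections import Counter
from math import log2

# The 4 x 8 example image from the notes, in raster scan order.
row = [21, 21, 21, 95, 169, 243, 243, 243]
image = row * 4  # 32 pixels

def entropy(counts):
    """Entropy in bits of a histogram given as a Counter."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

# First-order estimate: histogram of individual pixel values.
h1 = entropy(Counter(image))

# Second-order estimate: histogram of (pixel, next pixel) pairs in raster
# order, wrapping the last pixel around to the first.
pairs = Counter(zip(image, image[1:] + image[:1]))
h2_per_pair = entropy(pairs)

print(f"first-order estimate : {h1:.3f} bits/pixel")               # ~1.811
print(f"second-order estimate: {h2_per_pair / 2:.3f} bits/pixel")  # 1.250
```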
In this case, the difference between the first- and second-order estimates is 1.81 - 1.25 = 0.56 bits / pixel, so we can do better.

If we use a subtraction filter (each pixel is replaced by its difference from the pixel to its left, with the first pixel of each row left unchanged), we can transform the original image to

21  0  0  74  74  74  0  0
21  0  0  74  74  74  0  0
21  0  0  74  74  74  0  0
21  0  0  74  74  74  0  0

The first-order estimate on the subtraction-filtered image is

Gray Level    Count    Probability
0             16       1/2
21            4        1/8
74            12       3/8

H = -(1/2) log2(1/2) - (1/8) log2(1/8) - (3/8) log2(3/8) = 1.41 bits / pixel

Thus we can represent this image with 32 * 1.41 ≈ 46 bits. Since this filter does not achieve the second-order entropy of 1.25 bits / pixel, we know that it is possible to do even better.

Coding Efficiency

We can compute the average length of a code by adding up the number of bits in each symbol's codeword times the probability of that symbol. The efficiency of a code is the ratio of the entropy to the average code length. For example, if the entropy is 2.14 bits/symbol and a Huffman code has an average length of 2.2 bits/symbol, then the efficiency is 2.14 / 2.2 = 0.973. If symbols must be coded one at a time, the Huffman code is the most efficient possible. If we take advantage of interpixel redundancy, we can do better.

Unique Decodability

A code is no good if we can't unambiguously decode it. Consider this example:

Letter    Probability    Code 1    Code 2    Code 3    Code 4
e         0.5            0         0         0         0
t         0.25           0         1         10        01
s         0.125          1         00        110       011
n         0.125          10        11        111       0111

Average length:          1.125     1.25      1.75      1.875

Code 1 is the most efficient since it does the most compression, but it is worthless since we can't distinguish e and t. Code 2 is ambiguous, since 100 could mean either "tee" or "ts". Codes 3 and 4 are unambiguous. With Code 4, the receiver does not know that it has a complete letter until it receives the 0 that starts the next letter. Code 3 is instantaneous, since the receiver can recognize any letter as soon as its last bit is received.

Binary codes can be represented by trees, as in the figure on page 65 of Miano. Code 3 has the property that all codewords are leaves of the tree. This is good and is called a prefix code. For any unambiguous non-prefix code, it is always possible to find a prefix code that gets as much compression.
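To see why the prefix property makes Code 3 instantaneous, here is a small Python sketch (mine, not from the original notes) using the Code 3 table above: the decoder emits a letter the moment the buffered bits match a codeword, and it never needs to look ahead.

```python
# Code 3 from the table above: a prefix code, so no codeword is a
# prefix of another and each match can be emitted immediately.
CODE3 = {"e": "0", "t": "10", "s": "110", "n": "111"}
DECODE3 = {bits: letter for letter, bits in CODE3.items()}

def encode(text):
    return "".join(CODE3[ch] for ch in text)

def decode(bitstream):
    """Emit a letter as soon as the buffered bits match a codeword."""
    out, buffer = [], ""
    for bit in bitstream:
        buffer += bit
        if buffer in DECODE3:       # complete codeword recognized at once
            out.append(DECODE3[buffer])
            buffer = ""
    if buffer:
        raise ValueError("bitstream ended in the middle of a codeword")
    return "".join(out)

bits = encode("tens")
print(bits)           # "10" + "111" + "0" + "110" = "101110110"
print(decode(bits))   # tens
```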